Optimality guarantees for distributed statistical estimation

Authors

  • John C. Duchi
  • Michael I. Jordan
  • Martin J. Wainwright
  • Yuchen Zhang
Abstract

Large data sets often require performing distributed statistical estimation, with a full data set split across multiple machines and limited communication between machines. To study such scenarios, we define and study some refinements of the classical minimax risk that apply to distributed settings, comparing to the performance of estimators with access to the entire data. Lower bounds on these quantities provide a precise characterization of the minimum amount of communication required to achieve the centralized minimax risk. We study two classes of distributed protocols: one in which machines send messages independently over channels without feedback, and a second allowing for interactive communication, in which a central server broadcasts the messages from a given machine to all other machines. We establish lower bounds for a variety of problems, including location estimation in several families and parameter estimation in different types of regression models. Our results include a novel class of quantitative data-processing inequalities used to characterize the effects of limited communication.
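The first protocol class in the abstract (independent messages, no feedback) can be illustrated with a toy simulation. This is only a minimal sketch under assumed conditions, not the construction used in the paper: each machine holds Gaussian samples, sends a single message of `bits` bits containing its quantized local mean, and a server averages the messages. The function names, the uniform quantizer, and the assumed parameter range [-1, 1] are all hypothetical choices made for illustration.

```python
import random
import statistics


def quantize(value, bits, lo=-1.0, hi=1.0):
    """Round `value` to the nearest of 2**bits evenly spaced levels on [lo, hi].

    The range [lo, hi] is an assumption of this sketch.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = 2 ** bits
    clipped = min(max(value, lo), hi)
    step = (hi - lo) / (levels - 1)
    return lo + round((clipped - lo) / step) * step


def distributed_mean(samples_per_machine, bits):
    """Independent protocol: one `bits`-bit message per machine, no feedback."""
    messages = [quantize(statistics.fmean(s), bits) for s in samples_per_machine]
    # The server only sees the quantized messages, never the raw samples.
    return statistics.fmean(messages)


def simulate(theta=0.3, machines=20, n=50, bits=8, seed=0):
    """Return (centralized estimate, communication-limited estimate)."""
    rng = random.Random(seed)
    data = [[rng.gauss(theta, 1.0) for _ in range(n)] for _ in range(machines)]
    centralized = statistics.fmean(x for s in data for x in s)
    return centralized, distributed_mean(data, bits)


if __name__ == "__main__":
    for b in (1, 2, 4, 8, 16):
        central, limited = simulate(bits=b)
        print(f"bits={b:2d}  centralized={central:.4f}  distributed={limited:.4f}")
```

Running the script prints the two estimates for several message sizes, which gives an informal feel for the trade-off between communication budget and accuracy that the lower bounds make precise.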

Similar resources

Multiple Optimality Guarantees in Statistical Learning

Multiple Optimality Guarantees in Statistical Learning, by John C. Duchi. Doctor of Philosophy in Computer Science and the Designated Emphasis in Communication, Computation, and Statistics, University of California, Berkeley. Professor Michael I. Jordan, Co-chair; Professor Martin J. Wainwright, Co-chair. Classically, the performance of estimators in statistical learning problems is measured in terms ...

Optimal Simple Step-Stress Plan for Type-I Censored Data from Geometric Distribution

Abstract. A simple step-stress accelerated life testing plan is considered when the failure times in each level of stress are geometrically distributed under Type-I censoring. The problem of choosing the optimal plan is investigated using the asymptotic variance-optimality as well as determinant-optimality and probability-optimality criteria. To illustrate the results of the paper, an example i...

Distributed Statistical Estimation and Rates of Convergence in Normal Approximation

This paper presents new algorithms for distributed statistical estimation that can take advantage of the divide-and-conquer approach. We show that one of the key benefits attained by an appropriate divide-and-conquer strategy is robustness, an important characteristic of large distributed systems. We introduce a class of algorithms that are based on the properties of the geometric median, estab...

Distributed Nonlinear Robust Control for Power Flow in Islanded Microgrids

In this paper, a robust local controller has been designed to balance the power for distributed energy resources (DERs) in an islanded microgrid. Three different DER types are considered in this study: photovoltaic systems, battery energy storage systems, and synchronous generators. Since DER dynamics are nonlinear and uncertain, which may destabilize the power system or decrease the performanc...

Computational Limits of A Distributed Algorithm for Smoothing Spline

In this paper, we explore the statistical versus computational trade-off to address a basic question in the application of a distributed algorithm: what is the minimal computational cost of obtaining statistical optimality? In the smoothing spline setup, we observe a phase transition phenomenon for the number of deployed machines, which ends up being a simple proxy for computing cost. Specifically, a sha...

Publication date: 2014